Background update applied when mixing X11/Wayland and opening remote link
Categories
(Toolkit :: Application Update, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox94 | --- | fixed |
People
(Reporter: gerard-majax, Assigned: stransky)
References
(Depends on 1 open bug, Blocks 1 open bug)
Details
Attachments
(1 file)
I recently switched to forcing Wayland on my system (Ubuntu 21.10); previously Nightly was running over XWayland.
Since then, I have been unable to open links from third-party apps (IRC, Thunderbird, etc.); it always ends with an error message stating that Firefox is already running.
STR:
- MOZ_ENABLE_WAYLAND=1 firefox
- Click on a link in Thunderbird
Expected:
A new tab opens
Actual:
Error stating Firefox is already running and is unresponsive.
A side effect seems to be that this exposes bug 1480452 and makes me see about:restartrequired. Digging into that shows that when it is displayed, the platformBuildID reported in about:support is actually different from the one in platform.ini on disk, so this is not a false-positive mismatch. From discussion on Matrix,
agashlin> What you're describing seems consistent with this:
- An update is ready, Firefox is running.
- You try to open a URL from some other program. This starts Firefox, normally that will just pass the command line to the older instance very early and exit (remoting)
- For some reason, the newly launched Firefox can't find or communicate with the older one in order to remote the command line, so it proceeds as if you'd asked it to start a new instance (e.g. with --no-remote)
- The new instance applies the update and restarts itself (and again fails to remote)
- The new instance tries to open the profile, which is already in use, so you see the "Firefox is already running" message.
My recent experiments seem to corroborate this: I managed to avoid clicking on any link, and while Nightly was showing that updates were ready to be applied (and I think at least two have been shipped), I never got into the mismatching platformBuildID case. Then I purposely clicked on a link, waited for the unresponsive error message to pop up, and could verify afterwards that the on-disk platformBuildID did change.
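The comparison described above can be sketched as follows. This is a minimal illustration, assuming platform.ini is a plain INI file with a [Build] section carrying a BuildID key (an assumption about the file layout); the helper names and the sample build IDs are hypothetical, and the running build ID would be copied by hand from about:support.

```python
import configparser

# Hypothetical sample contents; real files carry more keys.
SAMPLE_PLATFORM_INI = """[Build]
BuildID=20211002093000
Milestone=94.0a1
"""


def read_build_id(platform_ini_text: str) -> str:
    """Return the BuildID value from the text of a platform.ini file."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(platform_ini_text)
    return parser["Build"]["BuildID"]


def update_applied_underneath(running_build_id: str, platform_ini_text: str) -> bool:
    """True when the on-disk build differs from the one the session started with."""
    return read_build_id(platform_ini_text) != running_build_id
```

If this returns True for a session that was never restarted, the on-disk install was replaced while the session was running, which is the genuine (not false-positive) mismatch described above.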
Comment 1•4 years ago
|
||
I think https://mastransky.wordpress.com/2020/03/16/wayland-x11-how-to-run-firefox-in-mixed-environment/ should help here.
Reporter | ||
Comment 2•4 years ago
|
||
Martin, I'm wondering whether this might be a dupe of bug 1645038 or bug 1634096. In both cases, they seem to be stalled and people have a hard time reproducing them. It reproduces 100% of the time in my case, so I'd be happy to help there.
Reporter | ||
Comment 3•4 years ago
|
||
(In reply to Robert Mader [:rmader] from comment #1)
I think https://mastransky.wordpress.com/2020/03/16/wayland-x11-how-to-run-firefox-in-mixed-environment/ should help here.
It might, but:
- it's invasive, and I dislike that (though so far I don't really see how else we could fix it).
- to the best of my knowledge, https://bugzilla.mozilla.org/show_bug.cgi?id=1480452#c8 does not make any obvious link between being unable to find the Firefox instance and about:restartrequired being shown "too much".
Reporter | ||
Comment 4•4 years ago
|
||
(In reply to Robert Mader [:rmader] from comment #1)
I think https://mastransky.wordpress.com/2020/03/16/wayland-x11-how-to-run-firefox-in-mixed-environment/ should help here.
Forcing MOZ_DBUS_REMOTE=1 seems to indeed do the trick.
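For anyone who prefers a wrapper over editing their shell profile, here is a minimal sketch of a launcher that sets both variables mentioned in this bug. The function names are hypothetical, and it assumes the already-running instance was started with the same variables; nothing here is part of Firefox itself.

```python
import os
import subprocess


def firefox_env(base_env=None):
    """Copy an environment and force Wayland plus the DBus remote."""
    env = dict(os.environ if base_env is None else base_env)
    env["MOZ_ENABLE_WAYLAND"] = "1"
    env["MOZ_DBUS_REMOTE"] = "1"
    return env


def open_url(url, binary="firefox"):
    """Hypothetical helper: hand a URL to Firefox with the variables set."""
    return subprocess.Popen([binary, url], env=firefox_env())
```

The environment is copied rather than modified in place, so other programs started from the same process are unaffected.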
Comment 5•4 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #4)
Forcing MOZ_DBUS_REMOTE=1 seems to indeed do the trick.
I wonder if we can make that the default, at least when running on Xwayland (or if we detect that dbus is available, or whatever the reason is why it's not used by default).
Reporter | ||
Comment 6•4 years ago
|
||
I guess Martin might know more about why that's not the case?
Reporter | ||
Comment 7•4 years ago
|
||
I have an update pending; I can see it under updates/0/ and so far, after opening links from different apps, no about:restartrequired yet.
Comment 8•4 years ago
|
||
I don't think this is related to updates. I see it every time when clicking on a link in a non-Wayland app (e.g. Signal) when running Wayland Firefox.
It might be related to the WMClass stuff -- I had to tweak StartupWMClass to fix start-up notifications.
Reporter | ||
Comment 9•4 years ago
|
||
(In reply to Laurențiu Nicola from comment #8)
I don't think this is related to updates. I see it every time when clicking on a link in a non-Wayland app (e.g. Signal) when running Wayland Firefox.
Please read the first comment carefully. It's not directly related to updates, but it can impact them.
Assignee | ||
Comment 10•4 years ago
|
||
(In reply to Robert Mader [:rmader] from comment #5)
(In reply to Alexandre LISSY :gerard-majax from comment #4)
Forcing MOZ_DBUS_REMOTE=1 seems to indeed do the trick.
I wonder if we can make that the default, at least when running on Xwayland (or if we detect that dbus is available, or whatever the reason is why it's not used by default).
That's Bug 1677462.
Comment 11•4 years ago
|
||
I can reproduce this issue, and I have a workaround.
I originally saw this when I tried testing Firefox on Wayland by using MOZ_ENABLE_WAYLAND=1 firefox. If I clicked a link from another application, I'd get the same dialog and the link wouldn't open.
When I switched to setting MOZ_ENABLE_WAYLAND=1 in my user environment for all programs, this no longer happened.
I could reproduce it again by unsetting MOZ_ENABLE_WAYLAND and then running firefox https://example.org; that produced the same dialog again.
Comment 12•4 years ago
|
||
I can reproduce this issue, and I have a workaround.
That works with Thunderbird, but it still happens to me when I click links in non-Wayland apps like Signal. I did set MOZ_ENABLE_WAYLAND globally (in ~/.config/environment.d on my DE).
Reporter | ||
Comment 13•4 years ago
|
||
The root cause is easy to see now that I have found it: https://searchfox.org/mozilla-central/rev/49b6e60550243b4b4d71d6ab35a3ff2b9a9f7c69/toolkit/xre/nsAppRunner.cpp#4553-4615
This is where we try to process pending updates. Right after that, we try to lock the profile.
So, what happens is:
- The user is running Firefox with Wayland force-enabled and other apps as X11, or otherwise;
- Either way, e.g., xdg-open is called in a way where it will try to find a remote instance of Firefox but will ultimately fail because of those differences in windowing systems;
- The user has a pending update stored;
- The user tries to open such a remote link;
- Since there is a pending update, https://searchfox.org/mozilla-central/rev/49b6e60550243b4b4d71d6ab35a3ff2b9a9f7c69/toolkit/xre/nsAppRunner.cpp#4553-4615 kicks in and applies the update successfully;
- Then, immediately afterwards in the code, we try to acquire a lock on the profile at https://searchfox.org/mozilla-central/rev/49b6e60550243b4b4d71d6ab35a3ff2b9a9f7c69/toolkit/xre/nsAppRunner.cpp#4637-4679
- Since we have a running instance, we have a lock file, so this LockProfile will run and ultimately fail at https://searchfox.org/mozilla-central/rev/55e8eba74b60b92d04b781f7928f54ef76b13fa9/toolkit/xre/nsAppRunner.cpp#2399-2510
And from there, we are doomed:
- The update has been applied in the background of the running session,
- Next time a process needs to be created, we will hit a platformBuildID mismatch and so actively present about:restartrequired to the user
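The ordering problem in the steps above can be modelled with a toy sketch. This is not the nsAppRunner.cpp code; the class, function, and constant names are hypothetical, and the update/restart step is collapsed into a single flag. It only illustrates that the update is processed before the profile lock is attempted.

```python
from dataclasses import dataclass

REMOTED = "opened in existing instance"
ALREADY_RUNNING = "already running error"
NEW_INSTANCE = "started new instance"


@dataclass
class StartupState:
    remote_reachable: bool  # can the new process talk to the running one?
    update_pending: bool    # is an update staged on disk?
    profile_locked: bool    # is the profile held by a running instance?
    update_applied: bool = False


def launch_with_url(state: StartupState) -> str:
    """Simplified startup order: remoting, then updates, then profile lock."""
    if state.remote_reachable:
        # Normal case: pass the URL along and exit before touching updates.
        return REMOTED
    if state.update_pending:
        # The pending update is applied before the lock is even attempted.
        state.update_applied = True
        state.update_pending = False
    if state.profile_locked:
        # The lock fails, but the files on disk have already been replaced.
        return ALREADY_RUNNING
    return NEW_INSTANCE
```

With remoting broken, an update pending, and the profile locked, the model returns the "already running" outcome with update_applied set, which is the state that later triggers about:restartrequired. With remoting working, the update is never touched.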
Reporter | ||
Comment 14•4 years ago
|
||
I'm not sure how much we should consider this to be just another case of the first instantiation mechanism described in https://bugzilla.mozilla.org/show_bug.cgi?id=1480452#c8?
I'll let you decide whether we should keep this bug open to track this specific case or whether we can just add more information to your existing description of the issue.
Comment 15•4 years ago
|
||
I have updates disabled (system-wide install on Linux) and I still get a "Firefox is already running, but is not responding" error. Since you updated the issue title to make it about updates, should I file another one?
Comment 16•4 years ago
|
||
The product::component has been changed since the backlog priority was decided, so we're resetting it.
For more information, please visit auto_nag documentation.
Reporter | ||
Comment 17•4 years ago
|
||
(In reply to Laurențiu Nicola from comment #15)
I have updates disabled (system-wide install on Linux) and I still get a "Firefox is already running, but is not responding" error. Since you updated the issue title to make it about updates, should I file another one?
If you are referring to the fact that the remote instance is not found when mixing X11/Wayland, it's already filed, as Martin said in comment 10.
Comment 18•4 years ago
|
||
I can confirm that MOZ_DBUS_REMOTE=1 fixes it, but the linked issue is WONTFIX.
Comment 19•4 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #0)
STR:
MOZ_ENABLE_WAYLAND=1 firefox
- Click on a link in Thunderbird
I guess this is a STR too:
MOZ_ENABLE_WAYLAND=1 firefox
firefox $url
So the question is: should Firefox try both remote methods? Actually, can we switch to dbus by default even on X11, and fall back to XRemote when that's not possible?
Comment 20•4 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #14)
I'm not sure how much we should consider this to be just another case of the first instantiation mechanism described in https://bugzilla.mozilla.org/show_bug.cgi?id=1480452#c8?
Hmm, well the about:restartrequired issue, as you have noted, is pretty clearly a result of Bug 1480452. It doesn't seem like this issue particularly affects the way that that bug needs to be fixed, though it does seem to represent a different way to end up in that situation.
As a bit of a side note, I think that I've got some time coming up to dig into that bug. It's going to be a huge job though, and I don't have a lot of resources to tackle it with. So I can't really promise any sort of timeline on making progress.
Even assuming, however, that we got Bug 1480452 fixed, it seems to me that it would mitigate this issue but not fix it. That is to say, you wouldn't see the about:restartrequired page, but this original issue would, I believe, remain:
Expected:
A new tab opens
Actual:
Error stating Firefox is already running and is unresponsive.
From what I have read in this bug, it sounds like the underlying issue here is this:
xdg-open is called in a way where it will try to find a remote instance for Firefox but will ultimately fail because of those differences of windowing systems ;
It seems like fixing that would solve both problems. If remoting worked properly in this context, opening the link would result in Firefox opening, remoting into the existing instance of Firefox, and exiting before it had a chance to install updates.
I will, of course, continue to attempt to make progress on Bug 1480452. But I recommend that this issue be solved properly by fixing remoting rather than waiting for a mitigation that is still a long way off. It sounds like the hope was that Bug 1677462 would provide this. But given that it has been marked WONTFIX, perhaps something else should be pursued. I don't really have any suggestions there; that isn't really my area of expertise.
I'll let you decide whether we should keep this bug open to track this specific case or whether we can just add more information to your existing description of the issue.
Given that this represents a new activation mechanism for Bug 1480452, I think it makes sense to have a bug open for it, like we have for Bug 1705217.
Assignee | ||
Comment 21•4 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #19)
(In reply to Alexandre LISSY :gerard-majax from comment #0)
STR:
MOZ_ENABLE_WAYLAND=1 firefox
- Click on a link in Thunderbird
I guess this is a STR too:
MOZ_ENABLE_WAYLAND=1 firefox
firefox $url
So the question is: should Firefox try both remote methods? Actually, can we switch to dbus by default even on X11, and fall back to XRemote when that's not possible?
It depends on what you mean by 'that's not possible'. That may mean:
- fall back to X11 when the DBus session interface is missing or DBus support is not built in.
- fall back to X11 when there isn't any Firefox listening on the DBus interface, so we try X11.
The second case slows down Firefox startup, as you need to start and query two remotes. For instance, Fedora uses only DBus on both X11 and Wayland, as we don't want to slow down startup by testing various remotes.
Also, I can't imagine a scenario where we want to fall back to X11 when we don't find an active DBus client - opening a release instance (X11) from nightly (DBus) doesn't look correct to me.
IMHO the best solution may be https://phabricator.services.mozilla.com/D97146 - use DBus if there's a possibility we're running on Wayland, or always use DBus when Firefox is built with --enable-dbus (I don't have any preference here).
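To make the first option concrete, here is a rough sketch of a "pick exactly one remote" policy. It is not what D97146 implements; the function names are hypothetical, and treating WAYLAND_DISPLAY or XDG_SESSION_TYPE=wayland as "possibly on Wayland" is an assumption made for illustration.

```python
from enum import Enum


class Remote(Enum):
    DBUS = "dbus"
    X11 = "x11"


def may_be_wayland(env) -> bool:
    """Assumed heuristic: a Wayland socket or session type is advertised."""
    return bool(env.get("WAYLAND_DISPLAY")) or env.get("XDG_SESSION_TYPE") == "wayland"


def choose_remote(env, dbus_built: bool, dbus_session: bool) -> Remote:
    """Pick a single remote up front so startup never queries two."""
    if not (dbus_built and dbus_session):
        # First fallback case: DBus is missing or not built in.
        return Remote.X11
    if may_be_wayland(env) or env.get("MOZ_DBUS_REMOTE"):
        return Remote.DBUS
    return Remote.X11
```

The point of the design is that the decision depends only on the environment, not on whether another Firefox answers, so the second (slower) fallback case never arises.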
Assignee | ||
Comment 22•4 years ago
|
||
Comment 24•4 years ago
|
||
I would just like to mention that should it be possible to detect this situation, we could also make ShouldProcessUpdates avoid this issue.
Reporter | ||
Comment 25•4 years ago
|
||
(In reply to Nick Alexander :nalexander [he/him] from comment #24)
I would just like to mention that should it be possible to detect this situation, we could also make ShouldProcessUpdates avoid this issue.
From what I recall of some Matrix discussion after comment 13, changing the behavior of ShouldProcessUpdates was not the favorite option here.
Comment 26•4 years ago
|
||
Comment 28•4 years ago
|
||
Comment 29•4 years ago
|
||
Backed out for causing bustages on Linux x64 asan. CLOSED TREE
Backout link : https://hg.mozilla.org/integration/autoland/rev/36dfe8c27d727bdccfd51435d5dcb3f9c9711dbd
Push with failures : https://treeherder.mozilla.org/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception%2Crunnable&revision=117f94e1e765b809972408a2b4943c67e32b0455&searchStr=linux%2Cx64%2Casan&selectedTaskRun=ea2l9mnaQMKjt78ZLrCiXg.0
Failure log : https://treeherder.mozilla.org/logviewer?job_id=352698432&repo=autoland&lineNumber=70543
Assignee | ||
Comment 30•4 years ago
|
||
Updated, Thanks.
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=df4f2c738361e5794a14572aa30848be4ed0fb5d
Comment 31•4 years ago
|
||
Comment 32•4 years ago
|
||
bugherder |